Journal of Genetics and Genomics

Elsevier BV

Preprints posted in the last 30 days, ranked by how well they match Journal of Genetics and Genomics's content profile, based on 36 papers previously published here. The average preprint has a 0.07% match score for this journal, so anything above that is already an above-average fit.

1
MLL3/4 methyltransferases regulate the differentiation of pluripotent stem cells via cellular respiration

Nur, S. M.; Jia, Y.; Ye, M.; Lepak, C. A.; Ben-Sahra, I.; Cao, K.

2026-03-26 developmental biology 10.64898/2026.03.24.713976 medRxiv
Top 0.1%
6.0%

Enhancer-regulating epigenetic modifiers play critical roles in normal physiological processes and human pathogenesis. The major enhancer regulator paralogs MLL3 and MLL4 (MLL3/4) belong to the lysine methyltransferase 2 (KMT2) family, which catalyzes the methylation of lysine 4 on histone H3 (H3K4me). MLL3/4 are required for enhancer activation and are essential for mammalian development and stem cell differentiation. Recent studies have linked MLL3/4 with different metabolic pathways in the context of stem cell self-renewal and cancer cell growth; however, the underlying mechanisms remain elusive. Here, we utilize Seahorse extracellular flux analysis, stable isotope tracing, stem cell biology techniques, and transcriptomic analysis to investigate the functional relationship of MLL3/4, cellular respiration, and stem cell differentiation. Our results indicate that the loss of MLL3/4 impairs glycolytic activity and mitochondrial respiration in murine embryonic stem cells by downregulating the rate-limiting glycolytic enzyme hexokinase 2 (HK2) and impairing the function of the alpha-ketoglutarate dehydrogenase (OGDH) complex. Furthermore, simultaneous overexpression of HK2 and OGDH rescues defects in both cellular respiration and differentiation caused by MLL3/4 loss. Taken together, our study reveals a novel mechanism by which epigenetic machineries such as MLL3/4 govern the differentiation of pluripotent stem cells and facilitates the understanding of disease pathogenesis driven by enhancer malfunction.

2
Dissecting oligogenic and polygenic indirect genetic effects through the lens of neighbor genotypic identity

Sato, Y.; Hamazaki, K.

2026-04-03 genetics 10.64898/2026.03.31.715746 medRxiv
Top 0.2%
4.7%

Individual phenotypes often depend on the genotypes of other individuals within a group. These phenomena are termed indirect genetic effects (IGEs) and have been distinguished from direct genetic effects (DGEs) using quantitative genetic models. Recent studies have utilized high-resolution polymorphism data to enable genomic prediction (GP) and genome-wide association studies (GWAS) of IGEs, but unified methods remain limited. Here we integrate polygenic and oligogenic IGEs using a multi-kernel mixed model incorporating two random effects with a single covariance parameter. Underlying this implementation, the Ising model of ferromagnetics enabled us to simplify locus-wise and background IGEs for GWAS and GP, respectively. Our simulations demonstrated that, while the previous and present models exhibited similar performance, the present model can infer a trade-off between DGEs and IGEs. By applying this method to three species of woody plants, we found evidence for intergenotypic competition in aspen and apple trees, but limited evidence in climbing grapevines. Based on GWAS, we also detected significant variants associated with competitive IGEs on apple trunk growth. Our study offers a flexible implementation for GWAS/GP of IGEs, thereby providing an effective tool to dissect the genetic architecture of group performance.

3
Ancient Ryukyu Jomon contributed to past and current genetic structure of Japanese populations

Matsunami, M.; Kawai, Y.; Speidel, L.; Koganebuchi, K.; Takigami, M.; Kakuda, T.; Adachi, N.; Kameda, Y.; Katagiri, C.; Shinzato, T.; Shinzato, A.; Takenaka, M.; Doi, N.; NCBN Controls WGS Consortium; Bird, N.; Hellenthal, G.; Yoneda, M.; Omori, T.; Ozaki, H.; Sakamoto, M.; Kinoshita, N.; Imamura, M.; Maeda, S.; Shinoda, K.-i.; Kanzawa-Kiriyama, H.; Kimura, R.

2026-04-07 evolutionary biology 10.64898/2026.04.03.712818 medRxiv
Top 0.3%
3.8%

Characterized by the earliest use of pottery, the Jomon culture was a unique Neolithic culture that spread throughout the Japanese Archipelago. Previous archaeological evidence suggests that Jomon hunter-gatherers colonized the southernmost islands, the Ryukyu Archipelago, by approximately 7,000 years before present (YBP). However, genetic characteristics of the Ryukyu Jomon population and its contribution to the modern population have not been elucidated yet. In this study, we newly sequenced 273 modern and 25 ancient (6,700-900 YBP) whole genomes collected across the Ryukyu Archipelago. Our analysis demonstrated a genetic differentiation between the Hondo (Japanese mainland) and Ryukyu Jomon, dating back to ~6,900 YBP. After the divergence from the Hondo Jomon, the Ryukyu Jomon experienced severe bottlenecks, with an effective population size of ~2,000. Admixture between the Ryukyu Jomon and migrants from the historic Hondo population occurred ~1,000 YBP, which corresponds to the widespread adoption of iron tools and agriculture in the Central Ryukyus. Different demographic histories between modern Hondo and Ryukyu populations resulted in different rates of Jomon ancestry in these populations. By providing a new perspective on the peopling of the Ryukyu Archipelago, this study significantly enhances our understanding of cultural transitions in the region.

4
GLIS3 is a key regulator of astrocyte differentiation in human neural stem cells

Pradhan, T.; Kang, H. S.; Jeon, K.; Grimm, S. A.; Park, K.-y.; Jetten, A. M.

2026-04-04 developmental biology 10.64898/2026.04.02.716227 medRxiv
Top 0.3%
3.6%

Astrocytes play a key role in neuronal homeostasis and in various neural disorders. The generation of astrocytes from neural progenitor cells (NPCs) and their functions are under the complex control of several signaling networks and transcription factors. In this study, we demonstrate that the transcription factor GLIS similar 3 (GLIS3), which has been implicated in several neurodegenerative diseases, is highly expressed in astrocytes and is required for the efficient differentiation of human NPCs into astrocytes. Loss of GLIS3 function greatly impairs astrocyte differentiation, resulting in reduced expression of astrocyte markers, whereas expression of exogenous GLIS3 restores the induction of astrocyte-specific genes, indicating a critical role for GLIS3 in astrocyte differentiation. Integrated transcriptomic and cistromic analyses revealed that GLIS3 directly regulates the transcription of several astrocyte-associated genes, including GFAP, SLC1A2, NFIA, and ATF3, in coordination with lineage-determining factors, such as STAT3, NFIA, and SOX9. We hypothesize that GLIS3 dysfunction disrupts this transcriptional network, thereby contributing to astrocyte-associated neurological disorders. Identification of GLIS3 as a key regulator of astrocyte differentiation and gene expression will advance our understanding of its role in neurodegenerative diseases and may provide a new therapeutic target.

5
Comprehensive bioinformatic analysis reveals novel potential diagnostic biomarkers associated with monocytes in osteoporosis

Qin, X.; Wen, B.; He, P.; Chen, Z.; Tan, S.; Mao, Z.

2026-03-24 genetics 10.64898/2026.03.20.713320 medRxiv
Top 0.5%
2.8%

Osteoporosis affects millions of women globally. In this study, we applied bioinformatics methods to screen for novel diagnostic biomarkers of osteoporosis in women using the GSE62402 and GSE56814 datasets. PCSK5, ZNF225, and H1FX were used to construct a diagnostic model. ROC, calibration, and decision curve analyses were performed to assess the diagnostic performance on the training (GSE56814) and external (GSE56815) datasets. The expression level of model genes was validated in GEO datasets. Furthermore, five transcription factors (ETS1, NOTCH1, MAZ, ERG, and FLI1) were identified as common upstream regulators of model genes. PCSK5, ZNF225, and H1FX serve as novel diagnostic biomarkers, providing new insights into the pathogenesis of and treatment strategies for osteoporosis in women.

6
Transmission dynamics of the COVID-19 pandemic across the emerging variants in mainland China: a hypergraph-based spatiotemporal modeling study

Wang, Y.; WANG, D.; Lau, Y. C.; Du, Z.; Cowling, B. J.; Zhao, Y.; Ali, S. T.

2026-04-17 public and global health 10.64898/2026.04.16.26351004 medRxiv
Top 0.6%
2.4%

Mainland China experienced multiple waves of the COVID-19 pandemic during 2020-2022, driven by emerging variants and changes in public health and social measures (PHSMs). We developed a hypergraph-based Susceptible-Vaccinated-Exposed-Infectious-Recovered-Susceptible (SVEIRS) model to reconstruct epidemic dynamics across 31 provinces, capturing transmission heterogeneity associated with clustered contacts. We assessed key characteristics of transmission at national and provincial levels during four outbreak periods: initial, localized pre-Delta, Delta, and widespread Omicron, which accounted for 96.7% of all infections. We found significant diversity in transmission contributions across cluster sizes, with a small fraction of larger clusters responsible for a disproportionate share of infections. Counterfactual analyses showed that reducing cluster-size heterogeneity, while holding overall exposure constant, could have lowered national infections by 11.70% to 30.79%, with the largest effects during the Omicron period. Ascertainment rates increased over time but remained spatially heterogeneous, ranging from 14.40% to 71.93%. Population susceptibility declined following mass vaccination (to 42.49% in Aug 2021, nationally) and rebounded (to 89.89% in Nov 2022) due to waning immunity, with variations across the provinces. Effective reproduction numbers displayed marked temporal and spatial variability, with higher estimates during Omicron. Overall, these results highlight the critical role of group contact heterogeneity in shaping epidemic dynamics.

7
FOXO3 regulated MIR503HG safeguards cellular quiescence by modulating PI3K/Akt pathway via miR-508/PTEN axis

Jathar, S. R.; Srivastava, J.; Dongardive, V.; Tripathi, V.

2026-03-28 cell biology 10.64898/2026.03.27.714688 medRxiv
Top 0.6%
2.3%

Long noncoding RNAs (lncRNAs) have emerged as an important class of regulatory ncRNAs and are known to fine-tune numerous cellular processes, including proliferation, differentiation, and development; however, their role in quiescence remains largely unexplored. A miRNA host gene lncRNA, MIR503HG, has been reported to play an important role in cancer development. Here, we demonstrate the role of MIR503HG lncRNA in regulating cellular quiescence. MIR503HG displays elevated levels in human diploid fibroblasts (HDFs) induced to undergo quiescence. Depletion of MIR503HG in HDFs affects the entry of cells into quiescence but has no effect on cell cycle progression, suggesting its role in quiescence attainment and/or maintenance. Additionally, MIR503HG depletion led to a drastic decrease in the levels of the miR508 target PTEN, with a concomitant increase in pAkt levels, indicating its role in negative regulation of miR508. Further, we demonstrate that the lncRNA MIR503HG regulates PTEN levels by acting as a ceRNA for miR508 to maintain cellular quiescence. Our studies illustrate that MIR503HG can function synergistically with miR503 to maintain cells under quiescence, and that both the miRNA-HG and the miRNA encoded by its gene locus control the same biological process in different ways by regulating different downstream genes.

8
Box H/ACA snoRNP regulates lipid storage through insulin signaling pathway in Drosophila melanogaster

Yang, H.; Zhao, L.; Zhou, X.; Li, X.; Huang, X.; Tian, Y.

2026-04-01 genetics 10.64898/2026.03.30.715344 medRxiv
Top 0.6%
2.1%

Text abstract: Lipid homeostasis is essential for organismal physiology, and its disruption contributes to metabolic disorders. Using an unbiased genetic modifier screen in Drosophila, we identified GAR1, a core component of the box H/ACA small nucleolar ribonucleoprotein complex, as a pivotal regulator of systemic lipid storage. We show that the H/ACA snoRNP complex is essential for maintaining lipid droplet morphology in adipose tissue and preventing ectopic fat accumulation. Moreover, null mutants of Gar1 or Dkc1 exhibit severe developmental defects, including reduced body size and larval lethality. RNA-seq analysis revealed that Gar1 dysfunction triggered widespread alternative splicing defects, specifically targeting key transcripts within the insulin signaling cascade, including chico, Pi3K92E, sgg, and Lip4. Furthermore, knockdown of Gar1 impaired insulin signaling, as evidenced by the reduced membrane localization of tGPH fluorescence. Genetic epistasis further positions GAR1 upstream of the lin-28/foxo axis, as knocking down lin-28 or foxo fully rescues the lipometabolic defects in GAR1-deficient animals. These findings reveal a previously unrecognized link between the snoRNP machinery and metabolic processes, establishing the box H/ACA complex as an important coordinator that integrates RNA processing with insulin-mediated nutrient sensing to ensure developmental and lipid homeostasis.
Article summary: Lipid metabolism is tightly controlled by multiple factors. To find new regulators, the authors performed a genetic screen and identified a small nucleolar protein, GAR1, that participates in fat storage and larval development. They demonstrated a critical role of the box H/ACA snoRNP complex in modulating alternative splicing and balancing the insulin cascade. Blocking two insulin-related genes reversed the lipid defects caused by Gar1 loss. These findings reveal that the box H/ACA complex integrates RNA processing with insulin-mediated nutrient sensing to ensure developmental and lipid homeostasis, offering a perspective for understanding the metabolic regulation network.

9
Spatial genome organization in nematodes with programmed DNA elimination

Simmons, J. R.; Xue, T.; McCord, R. P.; Wang, J.

2026-03-29 genomics 10.1101/2025.10.23.684251 medRxiv
Top 0.7%
2.1%

Programmed DNA elimination (PDE) is a notable exception to genome integrity, characterized by significant DNA loss during development. In many nematodes, PDE is initiated by DNA double-strand breaks (DSBs), which lead to chromosome fragmentation and subsequent DNA loss. However, the mechanism of nematode programmed DNA breakage remains largely unclear. Interestingly, in the human and pig parasitic nematode Ascaris, no conserved motif or sequence structures are present at chromosomal breakage regions (CBRs), suggesting the recognition of CBRs may be sequence-independent. Using Hi-C, we revealed that Ascaris CBRs engage in three-dimensional (3D) interactions before PDE, indicating that physical contacts between break regions may contribute to the PDE process. The 3D interactions are established in both Ascaris male and female germlines, demonstrating inherent genome organization associated with the CBRs and to-be-eliminated sequences. In contrast, in the unichromosomal horse parasite Parascaris univalens, transient pairwise interactions between neighboring CBRs that will form the ends of future somatic chromosomes were observed only during PDE. Intriguingly, we found that Ascaris PDE, which converts 24 germline chromosomes into 36 somatic ones, induces specific compartmentalization changes. Remarkably, Parascaris PDE generates the same set of 36 somatic chromosomes, and the 3D compartment changes following PDE are consistent between the two species. Overall, our findings suggest that CBRs spatially demarcate the retained and eliminated DNA and may contribute to their spatial organization during Ascaris PDE. We also demonstrated that the 3D genome reorganization of the somatic chromosomes in these nematodes following PDE is evolutionarily and developmentally conserved.

10
Identification of feeding apparatus components in a heterotrophic marine flagellate

Clifford, G.; Taylor, S. J. P.; Ishii, M.; Cisneros-Soberanis, F.; Akiyoshi, B.

2026-03-31 cell biology 10.64898/2026.03.30.714256 medRxiv
Top 0.7%
2.0%

Acquiring nutrients is a fundamental biological process of all organisms, playing crucial roles in ecological sustainability. Diplonemids are highly abundant heterotrophic unicellular flagellates that are widespread in the world's oceans. They have a highly complex microtubule-based feeding apparatus (cytostome-cytopharynx complex) located adjacent to the deep flagellar pocket from which two flagella emerge from parallel basal bodies. The apical papilla is a tongue-shaped structure unique to diplonemids that connects the cytopharynx and the flagellar pocket, the latter of which is formed by reinforcing microtubules (MTR) and two flagellar roots called intermediate and dorsal roots. Here we report the identification of 17 proteins that localize at the feeding apparatus or flagellar apparatus in Diplonema papillatum. Using ultrastructure expansion microscopy, we show that Mad2 and its interaction partner MBP65 localize at the MTR, intermediate root, and dorsal root. Homologs of proteins that associate with the flagellar apparatus in Trypanosoma brucei (PFR2, KMP11, BILBO1) localize at the feeding apparatus in D. papillatum. We also identify proteins that localize at the apical papilla, MTR, parallel microtubule loop, or cytopharynx. By discovering components of the feeding apparatus for the first time in diplonemids, this work forms the foundation to understand molecular mechanisms of the feeding apparatus in these highly abundant marine plankton.

11
The results of Transcriptome-wide Mendelian Randomization (TWMR) in large-scale populations can directly validate, across scales, the results of causal inference from deep learning combined with double machine learning on single-cell transcriptomes of human samples.

ye, w.; Jiang, X.; Shen, F.

2026-03-19 rheumatology 10.64898/2026.03.16.26348532 medRxiv
Top 0.7%
1.9%

Objective: To address core problems prevalent in biomedical research, including the "translational distance", the difficulty in aligning cross-scale studies, and the lack of direct validation of single-cell systems biology models in human samples, this study aims to verify whether the results of transcriptome-wide Mendelian randomization (TWMR) based on large-scale populations are consistent with the causal inference results of deep learning combined with double machine learning (DML) using single-cell transcriptome data from human samples, to clarify whether statistical biology and systems biology can converge to the same biological truth, and to provide methodological support for mechanism dissection and precision medicine research of complex diseases such as rheumatoid arthritis (RA).
Methods: This study integrated multi-omics data to conduct a two-stage causal inference and cross-scale validation analysis. In the first stage, based on the summary statistics of an RA genome-wide association study (GWAS) from 456,348 individuals of European ancestry in the UK Biobank (UKB), and cis-expression quantitative trait locus (cis-eQTL) data from 31,684 individuals in the eQTLGen Consortium, a two-sample Mendelian randomization approach was adopted. Transcriptome-wide causal effect analysis was performed using the inverse-variance weighted (IVW) method, MR Egger regression, and the weighted median method, and gene-level causal effect values were obtained after strict quality control and multiple testing correction. In the second stage, based on single-cell RNA sequencing (scRNA-seq) data from RA patients and healthy controls (RA group: 11 samples, 211,867 cells; healthy control group: 38 samples, 456,631 cells), after preprocessing via the Seurat pipeline, batch effect correction, and cell type annotation, a hierarchical deep neural network was constructed to complete feature compression of high-dimensional expression data, and the DML framework was used to estimate the causal effects of genes on RA disease status. Finally, Pearson correlation analysis was performed to conduct cell type-specific cross-scale validation of gene-level causal effect values obtained by the two methods, and the validated model was used to quantify the causal effects of 16 RA-related pathways from the Reactome database.
Results: This study confirmed that the gene causal effect values obtained from large-scale population TWMR analysis were significantly correlated with those calculated by the deep learning combined with DML model based on single-cell transcriptome data. Among the cell types examined, the correlation was significant in core naive B cells (r=0.202, p=3.2e-05, n=414; p<0.001) and core naive CD4 T cells (r=0.102, p=0.037, n=412). The validated DML model successfully quantified the cell type-specific causal effect values of 16 RA-related signaling pathways.
Conclusion: Statistical biology and systems biology can converge to the same biological truth. The cross-scale consistency between the two can significantly shorten the "translational distance" in biomedical research and enables direct validation of single-cell systems biology causal models of human samples based on large-scale population genetic data, reducing the excessive dependence on animal/cell experimental models in traditional research. This research paradigm not only provides a new path for mechanism dissection and therapeutic target screening of complex diseases such as RA, but also provides a feasible solution for rare disease research to break through the limitation of GWAS sample size, and lays an important theoretical and methodological foundation for constructing standardized systems biology models of human complex diseases and promoting the development of precision medicine.

12
Single-cell lung eQTL dataset of Asian never-smokers highlights the roles of alveolar cells in lung cancer etiology

Luong, T.; Yin, J.; Li, B.; Shin, J. H.; Sisay, E.; Mikhail, S.; Qin, F.; Anyaso-Samuel, S.; Kane, A.; Golden, A.; Liu, J.; Lee, C. H.; Zhang, Z. E.; Chang, Y. S.; Byun, J.; Han, Y.; Landi, M. T.; Mancuso, N.; Banovich, N. E.; Rothman, N.; Amos, C.; Lan, Q.; Yu, K.; Zhang, T.; Long, E.; Shi, J.; Lee, J. G.; Kim, E. Y.; Choi, J.

2026-03-27 genetics 10.64898/2026.03.26.714500 medRxiv
Top 0.8%
1.8%

Single-cell expression quantitative trait loci (sc-eQTL) analyses are powerful in identifying context-specific susceptibility genes from genome-wide association study (GWAS) loci. However, few studies have comprehensively investigated cells of lung cancer origin in non-European populations. Here, we built a lung sc-eQTL dataset from 129 Korean women never-smokers with epithelial cell enrichment. eQTL mapping identified 2,229 genes with an eQTL in 33 cell types, including East Asian-specific findings when compared to predominantly European datasets. Integration with single-cell chromatin accessibility data demonstrated an enrichment of cell-type-specific eQTLs in cell-type-matched candidate enhancers, while shared eQTLs were more frequently found near promoters. Colocalization and transcriptome-wide association study unveiled 36 susceptibility genes from 22 cell types in 22 lung cancer loci, including 10 loci not achieving genome-wide significance in prior GWAS. Around 47% of these genes were from cells of the alveoli, underscoring their importance, especially in lung adenocarcinoma (LUAD) susceptibility. Focusing on the trajectory of alveolar epithelial cell regeneration, we detected 785 cell-state-interacting QTLs, which overlapped with 28% (10) of the identified susceptibility genes. Finally, we experimentally validated an East Asian- and alveolar type 2 cell-specific eQTL of TCF7L2 underlying the East Asian LUAD locus 10q25.2. Consistent with its role as a Wnt/β-catenin effector, TCF7L2 displayed a significant effect on lung adenocarcinoma cell growth. Our data highlight context-specific susceptibility genes, especially from alveolar cells of the lung, contributing to lung cancer etiology.

13
Chromosome-scale genome of the woody oilseed crop sacha inchi elucidates the molecular basis of alpha-linolenic acid biosynthesis and triacylglycerol accumulation in seeds

Pan, B.-Z.; Zhang, X.; Hu, X.-D.; Fu, Q.; Chen, M.-S.; Tao, Y.-B.; Niu, L.-J.; He, H.; Shen, Y.; Cheng, Z.; Lang, T.; Liu, C.; Xu, Z.-F.

2026-03-20 genomics 10.64898/2026.03.18.712556 medRxiv
Top 0.8%
1.8%

Sacha inchi (Plukenetia volubilis L.) is an emerging woody oilseed crop prized for its high alpha-linolenic acid (ALA) content. Despite its nutritional and economic value, the lack of high-quality genomic resources has hindered genetic improvement and the elucidation of its unique polyunsaturated fatty acid and lipid biosynthetic pathways. In this study, we report a high-quality, chromosome-scale genome assembly of sacha inchi with a total length of 710.62 Mb, integrated from Illumina, PacBio, and chromosome conformation capture (Hi-C) technology. The genome harbors 37,570 protein-coding genes and 379.86 Mb (53.45%) of repetitive sequences. Phylogenomic analysis reveals that sacha inchi diverged from its closest relative, Ricinus communis, approximately 36.2 million years ago. Comparative genomics indicates that sacha inchi experienced only ancient whole genome duplication events. To elucidate the mechanisms governing ALA biosynthesis and triacylglycerol (TAG) accumulation in sacha inchi seeds, we performed temporal transcriptome profiling across six seed development stages. Our findings demonstrate that high TAG content is primarily driven by the sustained expression of biosynthetic genes and low activity of degradation genes during mid-to-late seed development. Notably, while genes encoding stearoyl-ACP desaturases (SADs) maintain the precursor pool, the expression of genes encoding fatty-acid desaturase 2 (FAD2) and fatty-acid desaturase 3 (FAD3) is positively correlated with the final accumulation of C18:2 and C18:3 fatty acids. We also identified lncRNAs as potential epigenetic regulators of these key pathways. This high-quality genome provides a critical foundation for elucidating the molecular mechanisms of seed growth and development in sacha inchi.

14
Transposons Triggered Dynamic Evolution of MKK3 Gene, a Key Regulator for Seed Dormancy in Barley

Tressel, L. G.; Caspersen, A. M.; Walling, J. G.; Gao, D.

2026-03-25 plant biology 10.64898/2026.03.23.713676 medRxiv
Top 0.8%
1.8%

Barley (Hordeum vulgare L.) is an important crop worldwide, and its seed dormancy is primarily controlled by a Mitogen-Activated Protein Kinase Kinase 3 (MKK3) gene. Although the kinase activity of MKK3 and its roles in barley post-domestication have been widely studied, the pre-domestication evolution of MKK3 and the spread of nondormant alleles among global barley varieties remain largely unexplored. In this study, we analyzed MKK3 sequences in barley and its wild progenitor (H. spontaneum) and identified two polymorphic miniature inverted-repeat transposable elements (MITEs). Comparative analyses indicated that the insertion/excision of the MITEs predated the current estimates of barley domestication. Examination of the barley pangenomes coupled with droplet digital (dd) PCR revealed extensive copy number variation of MKK3 and suggested that transposons likely drove tandem amplification of the MKK3 gene on chromosome 5H. Additionally, approximately 1-Kb MKK3 sequences were found on chromosomes 1H and 6H. Further analysis indicated that these short MKK3 sequences were captured by a CACTA transposon that also contained fragments from four other expressed genes. The acquisition of MKK3 was estimated to have occurred between 1.9 and 2.5 million years ago. Together, these findings illuminate the dynamic pre-domestication evolution of the MKK3 gene and suggest three independent origins of highly nondormant barley worldwide, including a unique lineage predominant in Ethiopian germplasm. This study reveals the pivotal roles of transposons in MKK3 evolution and provides helpful information for understanding the complex history of the MKK3 gene in barley and for improving preharvest sprouting (PHS)-tolerant varieties under distinct natural conditions.

15
The metabolome and proteome of stem cell-derived human primordial germ cells: a multi-omics approach

Vaz Santos, M.; Schomakers, B. V.; Llobet Ayala, M.; Jamali, T.; van Weeghel, M.; van Pelt, A. M. M.; Mulder, C. L.; Hamer, G.

2026-04-02 developmental biology 10.64898/2026.03.31.715517 medRxiv
Top 0.9%
1.7%

Primordial germ cells (PGCs) are the population of cells that, in the human embryo, are specified at day 12 post-fertilization and form the precursor cells for the future egg or sperm cells. Although in vitro differentiation of PGCs from human stem cells has been achieved, these primordial germ cell-like cells (hPGCLCs) fail to further mature. The reason for this is unclear. Previous studies in mice revealed that several specific metabolic changes occur during the maturation of these cells, which are essential for their developmental progress. However, very little is known about the metabolic profile of human primordial germ cells. Given the severe scarcity of human PGCs, hPGCLCs serve as a research model to study PGC formation. To investigate this, we differentiated hPGCLCs using induced pluripotent stem cells and performed a mass spectrometry analysis to establish their metabolome and proteome. These cells revealed a distinct metabolic profile, with changes particularly at the proteome level. This included a shift between the canonical and non-canonical citric acid cycle in hPGCLCs, downregulation of late-stage glycolysis, and reduction of nucleotide de novo synthesis. By providing an integrative map of these metabolic networks, we aim to provide insight into the influence of metabolism on human PGC development that could help improve methods for in vitro differentiation and maturation of hPGCLCs.

16
Clarified an rDNA Gene Unit Pattern with (CTTT)n and (CT)n Microsatellites Aggregation Ahead of and Behind the Gene in Human Genome

Shen, J.; Tang, S.; Xia, Y.; Qin, J.; Xu, H.; Tan, Z.

2026-03-24 genetics 10.64898/2026.03.22.713381 medRxiv
Top 0.9%
1.7%

Background: Conventional models of human ribosomal DNA (rDNA) array organization have historically depended on transcription-centric boundaries, partitioning the unit into a ~13 kb rDNA transcription region and a monolithic ~31 kb intergenic spacer (IGS). While our previous identification of Duplication Segment Units (DSUs) mapped these arrays based on an intuitive analysis of the microsatellite density landscape of the complete reference human genome, our present deep mining of this landscape has revealed a more accurate rDNA Gene Unit Pattern.
Methods & Results: In this study, we conducted a deep mining analysis of our previously established microsatellite density landscape of the T2T-CHM13 assembly, focusing specifically on nucleolar organizing regions (NORs). We suggest a more accurate rDNA Gene Unit Pattern containing a (CTTT)n microsatellite aggregation ahead of the rDNA gene and a (CT)n microsatellite aggregation behind the gene, rather than a pattern featuring an IGS region inserted between two rDNA genes.
Conclusions: A correct rDNA gene pattern of the human genome probably includes a (CTTT)n microsatellite aggregation ahead of the gene and a (CT)n microsatellite aggregation behind it, which possibly constitute cis- and trans-regulating regions; the (CTTT)n and (CT)n microsatellite aggregations may provide two different local stable DNA structures for regulatory protein binding.

17
A digitally-enabled, stage-based community intervention for maternal and child health: Experimental evidence from rural China

Chen, Y.; Wu, Y.; Weber, A.; Medina, A.; Guo, Y.; Balakrishnan, S.; Zhang, H.; Zhou, H.; Rozelle, S.; Darmstadt, G. L.; Sylvia, S.

2026-03-30 public and global health 10.64898/2026.03.27.26349570 medRxiv
Top 0.9%
1.7%

Comprehensive and responsive interventions are increasingly prioritized to address the diverse and evolving health challenges faced by mothers and children during the first 1,000 days of life. However, evidence remains limited on how such interventions can be operationalized in low-resource settings without overstretching frontline health workers. We developed a comprehensive yet flexible community-based intervention, the Healthy Future program, which integrates a stage-based maternal and child health curriculum with mHealth-enabled infrastructure to deliver targeted, stage-based support through home visits in low-resource settings. We evaluated its impact through a cluster-randomized controlled trial across 119 rural townships in China. The program demonstrated improvements across multiple health, behavioral, and intermediate outcomes, including young child feeding practices, caregiving knowledge, maternal mental health, and perceived social support. Overall, this study illustrates a move beyond stand-alone interventions toward a scalable, multidimensional delivery model capable of providing comprehensive, flexible, and timely support to mothers and children in low-resource communities while remaining feasible for large-scale implementation.

18
Identification, evolutionary history and characteristics of orphan genes in root-knot nematodes

Seckin, E.; Colinet, D.; Bailly-Bechet, M.; Seassau, A.; Bottini, S.; Sarti, E.; Danchin, E. G.

2026-04-11 bioinformatics 10.64898/2025.12.19.695360 medRxiv
Top 1.0%
1.7%

Orphan genes, lacking homologs in other species, are systematically found across genomes. Their presence may result from extensive divergence from pre-existing genes or from de novo gene birth, which occurs when a gene emerges from a previously non-genic region. In this study, we identified orphan genes in the genomes of globally distributed plant-parasitic nematodes of the genus Meloidogyne and investigated their origins, evolution, and characteristics. Using a comparative genomics framework across 85 nematode species, we found that 18% of Meloidogyne genes are genus-specific, transcriptionally supported orphans. By combining ancestral sequence reconstruction and synteny-based approaches, we inferred that 20% of these orphan genes originated through high divergence, while 18% likely emerged de novo. Proteomic and translatomic evidence confirmed the translation of a subset of these genes, and feature analyses revealed distinctive molecular signatures, including shorter length, signal peptide enrichment, and a tendency for extracellular localization. These findings highlight orphan genes as a substantial and previously underexplored component of the Meloidogyne genome, with potential roles in their worldwide parasitism.

19
Uncovering zebrafish embryonic proteome dynamics across 16 time points during the first 24 hours of development

Fang, F.; Poulos, W.; Yue, Y.; Li, K.; Cibelli, J.; Liu, X.; Sun, L.

2026-03-26 developmental biology 10.64898/2026.03.24.713983 medRxiv
Top 1%
1.7%

Defining how proteins change over developmental time is valuable for studies deciphering regulatory genetic networks in vertebrate development, biology, and pharmacology. As a step toward such quantitative studies of dynamic network behavior, we produced an atlas using a mass spectrometry-based method to investigate protein expression changes across 16 time points from the zygote to the early pharyngula stage of zebrafish embryos. We systematically summarize 8 clusters for interrogating changes in protein expression associated with the development of zebrafish embryos. Specifically, we identified a class of zinc finger-related transcription factors primarily located on the long arm of chromosome 4, which are highly expressed during zygotic genome activation. Furthermore, we highlight the power of this analysis to assign developmental stage-specific expression information to chromosomes and tissues. Time-resolved analyses reveal significant discordance between differential transcript and protein expression, whereas no time lag is observed for proteins involved in stable and fundamental biological processes, such as metabolism (e.g., Ppt2a and Gatm), cytoskeletal organization (e.g., Col18a1), and the translation machinery (e.g., Eif4enif1). This atlas offers high-resolution and in-depth molecular insights into zebrafish development, providing a resource for developmental biologists to generate hypotheses for functional analysis of proteins during early vertebrate embryogenesis. Highlights: A global protein expression database with high time resolution is created for zebrafish embryos. Distinct patterns of protein expression correlate with biological processes. Transcription factors have a burst of expression from the gastrulation stage. Developmental stage-specific protein expression was assigned to chromosomes and tissues. High-resolution embryonic transcriptome and proteome datasets were compared and connected.

20
Integrative Identification and Characterization of PCOS-Associated lncRNAs From the Interface of Genetic Association, Transcriptomics, and Gene Structure Evolution

He, Z.; Li, Y.; Shkurat, T. P.; Butenko, E. V.; Derevyanchuk, E. G.; Lomteva, S. V.; Chen, L.; Lipovich, L.

2026-04-02 genomics 10.64898/2026.03.31.715548 medRxiv
Top 1%
1.5%

Background: Polycystic ovary syndrome (PCOS) is a prevalent endocrine disorder and a leading cause of female infertility, with complex genetic, metabolic, and hormonal etiologies. Long non-coding RNAs (lncRNAs) have emerged as important regulators of diverse biological processes, yet their roles in PCOS remain underexplored. Here, we identified and characterized PCOS differentially expressed gene-associated lncRNAs (PDEGALs) with an integrative approach combining expression data, genetic association, and evolutionary analysis. Methods: Thirty-three PCOS-associated protein-coding genes were obtained from our prior study, and all their nearby and overlapping lncRNAs were annotated. These candidates were analyzed using UCSC Genome Browser-mapped annotations and datasets, including NCBI RefSeq, GENCODE, GTEx, GWAS SNPs, and conservation, as well as the FANTOM5 cap analysis of gene expression (CAGE) promoter data, to assess their expression, regulatory potential, genetic variant overlaps, and evolutionary conservation. Results: Twenty-three PDEGALs (18 antisense to, and 5 sharing bidirectional promoters with, known PCOS-associated protein-coding genes) were identified. Seventeen PDEGALs contained GWAS SNPs with statistically significant disease associations, 9 of which were associated with PCOS-related traits. Five PDEGALs demonstrated expression in the KGN granulosa cell model of PCOS. Key gene structure element (KGSE) analysis revealed that most PDEGALs are primate-specific. Integrating four criteria (GTEx expression, GWAS SNPs, FANTOM promoterome, and KGSE conservation) highlighted HELLPAR as the only lncRNA fulfilling all four, while five others (PGR-AS1, MTOR-AS1, ENSG00000265179, ENSG00000256218, and LOC105377276) fulfilled three of the four criteria. Conclusions: We have systematically identified candidate PCOS regulatory lncRNAs with convergent genetic, expression, and evolutionary evidence. These results provide a framework for functional validation and highlight lncRNAs as potential biomarkers and therapeutic targets in PCOS that function by regulating their nearby and overlapping protein-coding genes.